What is claimed is:

1. A system for processing a multimedia data file to
provide information supporting user navigation of
multimedia data file content, comprising:

a content parser to identify text and image content of
a data file;

an image processor for processing said identified 
image content to identify embedded text content; 

a text sorter for parsing said identified text and
said identified embedded text to locate text items in
accordance with predetermined sorting rules; and

memory for storing a navigation file containing said
text items.



2. The system of claim 1, wherein the navigation file
links to at least one internal document object.

3. The system of claim 1, wherein the navigation file
links to at least one external document object.

4. The system of claim 1, wherein the image processor
comprises a black and white image processor comprising: 

a pixel smearing component reducing text to a 
rectangular block of pixels; and

an image filtering component for cleaning a smeared
image.
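
As an illustration only of the pixel smearing and image filtering components recited in claim 4, the following is a minimal sketch that assumes a run-length style of smearing on a binary image (1 = ink, 0 = background). The claim does not name an algorithm, so the function names, the `max_gap` and `min_run` parameters, and the list-of-rows image representation are all hypothetical.

```python
def smear_row(row, max_gap):
    """Fill background gaps of at most max_gap pixels that lie between ink pixels."""
    out = list(row)
    last_ink = None
    for i, px in enumerate(row):
        if px == 1:
            gap = i - last_ink - 1 if last_ink is not None else 0
            if 0 < gap <= max_gap:
                for j in range(last_ink + 1, i):
                    out[j] = 1  # merge neighbouring characters into one block
            last_ink = i
    return out


def drop_short_runs(row, min_run):
    """Erase ink runs shorter than min_run (a simple speckle filter, assumed here)."""
    out = list(row)
    i, n = 0, len(row)
    while i < n:
        if row[i] == 1:
            j = i
            while j < n and row[j] == 1:
                j += 1
            if j - i < min_run:
                for k in range(i, j):
                    out[k] = 0
            i = j
        else:
            i += 1
    return out


def smear_and_clean(image, max_gap=2, min_run=3):
    """Smear every row, then filter isolated specks from the smeared image."""
    smeared = [smear_row(row, max_gap) for row in image]
    return [drop_short_runs(row, min_run) for row in smeared]
```

In this sketch, closely spaced ink pixels merge into a solid horizontal run, while an isolated pixel far from any text is removed by the filtering pass.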

5. The system of claim 1, wherein the content parser 
applies text extraction rules to identify text and identify
a document structure, wherein the document structure
defines a context for identified text.



6. The system of claim 1, wherein the content parser
applies pre-defined hierarchical rules for determining a
level of identified text.



7. The system of claim 1, wherein the image processor
applies object templates to identify embedded text. 






8. The system of claim 1, wherein the system refines a 
search resolution during a text identifying process to 
determine a location of the embedded text within an image. 

9. The system of claim 1, wherein identified text
comprises hyperlinks.



10. A graphical User interface system supporting 
processing of a multimedia data file to provide information 
supporting user navigation of multimedia data file content, 
comprising:

a menu generator for generating one or more menus
permitting User selection of an input file and format to
be processed; and

an icon permitting User initiation of generation of a
navigation file supporting linking of input file elements
to external documents by parsing and sorting text and image
content to identify text for incorporation in a navigation
file.



11. The system of claim 10, wherein identified text 
comprises hyperlinks.

12. The system of claim 10, wherein the navigation file
further comprises links to at least one internal document
object.



13. A method of creating an anchorable information unit in
a portable document format document, comprising the steps
of:

extracting a text segment from the portable document
format document;

determining a context of the segment, wherein the
context is selected from a context sensitive hierarchical
structure; and

defining the text segment as an anchorable information
unit according to the context.



14. The method of claim 13, wherein the portable document
format document includes one or more textual objects
and one or more nontextual objects, wherein the
objects include textual segments.

15. The method of claim 13, wherein the step of
determining the context further comprises the steps of:

comparing the text segment to a plurality of known
patterns within the portable document format document; and

determining the context upon determining a match between
the text segment and a known pattern of the portable
document format document.
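
The comparison against known patterns recited in claim 15 could be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the pattern table, the context labels, and the `determine_context` function are hypothetical, and regular expressions are only one possible way to represent a "known pattern".

```python
import re

# Hypothetical table of known patterns and the context each one implies.
KNOWN_PATTERNS = [
    ("chapter_heading", re.compile(r"^Chapter\s+\d+\b")),
    ("section_heading", re.compile(r"^\d+(\.\d+)+\s+\S")),
    ("figure_caption", re.compile(r"^(Figure|Fig\.)\s+\d+")),
]


def determine_context(segment):
    """Return the context of the first known pattern that matches, else None."""
    text = segment.strip()
    for context, pattern in KNOWN_PATTERNS:
        if pattern.match(text):
            return context
    return None
```

A segment that matches no pattern is left without a context here; a fuller system would presumably fall back to other cues such as location or style.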



16. The method of claim 13, wherein the step of extracting
text further comprises the steps of:

extracting text from an underlying image of the
portable document format document;

determining a type for the image, wherein the type is
one of a black and white image, a grayscale image, and a
color image; and

processing the image according to the type.
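
The type determination in claim 16 can be illustrated with a short sketch. It assumes the image is available as a list of `(r, g, b)` tuples with channel values from 0 to 255; that representation, the type labels, and the `classify_image` name are assumptions made for this example only.

```python
def classify_image(pixels):
    """Classify an image as black_and_white, grayscale, or color.

    pixels: iterable of (r, g, b) tuples, each channel in 0..255 (assumed format).
    """
    pixels = list(pixels)
    # Any pixel with unequal channels makes the image a color image.
    if any(not (r == g == b) for r, g, b in pixels):
        return "color"
    # All channels equal: pure black/white if only the two extreme values occur.
    if all(r in (0, 255) for r, _, _ in pixels):
        return "black_and_white"
    return "grayscale"
```

The returned label could then select a type-specific processing routine, for example smearing and filtering for black and white images.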



17. The method of claim 13, wherein the portable document 
format document includes a known context sensitive 
hierarchical structure.

18. The method of claim 17, wherein the context sensitive
hierarchical structure, including the anchorable
information unit, is searchable.



19. The method of claim 13, wherein the context includes a 
location for the extracted text segment. 



20. The method of claim 13, wherein the step of
determining a context further comprises the step of
determining a location and a style of the text segment.



21. The method of claim 13, further comprising the step of
storing an extracted text segment in a Standard Generalized
Markup Language syntax using a predefined grammar.



22. The method of claim 13, wherein the anchorable
information unit is automatically hyperlinked.

23. A program storage device readable by machine, tangibly
embodying a program of instructions executable by the
machine to perform method steps for creating an anchorable
information unit file from a portable document format
document, the method steps comprising:

parsing the portable document format document into
textual portions and non-text portions;

extracting structure from the textual portions and the
non-text portions;

determining text within textual portions, and text within
the non-text portions; and

hyperlinking a plurality of keywords within the
textual portions and non-text portions to a related
document.

24. The program storage device of claim 23, wherein the 
step of parsing further comprises the step of 
differentiating color image content from black-and-white
content.




25. The program storage device of claim 23, wherein the
step of extracting further comprises the steps of:

determining a level for extracted textual portions;

associating the context with the text; and

pattern matching extracted text to the portable
document format document to determine a context and a
location.



26. The program storage device of claim 25, wherein the 
level is one of a paragraph, a heading and a subheading.



27. The program storage device of claim 25, wherein the 
step of pattern matching further comprises the steps of: 

determining a median font size for the portable
document format document;



comparing a font size of the extracted text to the 
median font size for the portable document format document; 
and 

determining a context according to font size. 
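
The median font size comparison of claim 27 might look like the sketch below. The ratio thresholds (`heading_ratio`, `subheading_ratio`), the `(text, font_size)` input format, and the function name are assumptions chosen for illustration; the claims state only that a font size is compared to the document median and that a context follows from it. The level names are borrowed from claim 26.

```python
from statistics import median


def classify_by_font_size(segments, heading_ratio=1.5, subheading_ratio=1.15):
    """Assign a context to each segment by comparing its font size to the median.

    segments: list of (text, font_size) pairs for one document (assumed format).
    Returns a list of (text, context) pairs.
    """
    doc_median = median([size for _, size in segments])
    results = []
    for text, size in segments:
        if size >= heading_ratio * doc_median:
            context = "heading"
        elif size >= subheading_ratio * doc_median:
            context = "subheading"
        else:
            context = "paragraph"
        results.append((text, context))
    return results
```

Using the median rather than the mean keeps a few very large title fonts from shifting the baseline, since body text usually dominates the segment count.
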

28. The program storage device of claim 23, wherein the
step of hyperlinking further comprises the step of creating
the anchorable information unit file, wherein the plurality
of keywords are anchorable information units.



